Session 7A

Privacy (II)

Conference: 10:30 AM — 11:50 AM HKT
Local: Jun 9 Wed, 10:30 PM — 11:50 PM EDT

Cryptographic Key Derivation from Biometric Inferences for Remote Authentication

Erkam Uzun (Georgia Institute of Technology, USA), Carter Yagemann (Georgia Institute of Technology, USA), Simon Chung (Georgia Institute of Technology, USA), Vladimir Kolesnikov (Georgia Institute of Technology, USA), Wenke Lee (Georgia Institute of Technology, USA)

Biometric authentication is becoming increasingly popular thanks to its appealing usability and improvements in biometric sensors. At the same time, it raises serious privacy concerns, since the common deployment involves storing bio-templates on remote servers. Current solutions propose to keep these templates on the client's device, outside the server's reach, but this binds the client to the initial device. A more attractive solution is to have the server authenticate the client, thereby decoupling authentication from the device. Unfortunately, existing biometric template protection schemes suffer from limitations in either practicality or accuracy. State-of-the-art deep learning (DL) solutions solve the accuracy problem in face- and voice-based verification. However, existing privacy-preserving methods do not accommodate DL methods, as they are generally tailored to the hand-crafted feature spaces of specific modalities. In this work, we propose a novel pipeline, Justitia, that makes DL inferences on face and voice biometrics compatible with standard privacy-preserving primitives such as fuzzy extractors (FE). To this end, we first form a bridge between the Euclidean (or cosine) space of DL and the Hamming space of FE, while maintaining the accuracy and privacy of the underlying schemes. We also introduce efficient noise-handling methods to keep the FE scheme practically applicable. We implement an end-to-end prototype to evaluate our design, then show how to improve security for sensitive authentications and usability for non-sensitive, day-to-day authentications. Justitia achieves the same error rate as the plaintext baseline on the YouTube Faces benchmark: 0.33% false rejection at zero false acceptance. Moreover, combining face and voice achieves 1.32% false rejection at zero false acceptance. According to our systematic security assessments, conducted with prior approaches and our novel black-box method, Justitia achieves ~25 bits and ~33 bits of security for the face- and face&voice-based pipelines, respectively.
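
The Euclidean-to-Hamming bridge is the pipeline's key enabler. As a rough illustration of how such a bridge can work (not necessarily the paper's construction), the Python sketch below uses random-hyperplane hashing, a standard technique under which two embeddings with high cosine similarity yield bit strings with small Hamming distance, which is exactly what a fuzzy extractor needs. All dimensions and the noise model here are invented for the example.

```python
# Sketch: binarizing DL face/voice embeddings so that cosine similarity
# maps to Hamming distance (random-hyperplane hashing). Illustrative only;
# the paper's actual Euclidean-to-Hamming bridge may differ.
import numpy as np

rng = np.random.default_rng(seed=42)

def make_projection(dim, n_bits):
    """Random hyperplanes; fixed per deployment so codes are repeatable."""
    return rng.standard_normal((n_bits, dim))

def binarize(embedding, planes):
    """One bit per hyperplane: which side the embedding falls on."""
    return (planes @ embedding > 0).astype(np.uint8)

# Two noisy readings of the same user stay close in Hamming space,
# so a fuzzy extractor can reproduce the same key from both.
planes = make_projection(dim=512, n_bits=256)
enroll = rng.standard_normal(512)
probe = enroll + 0.1 * rng.standard_normal(512)   # sensor/DL noise
code_e, code_p = binarize(enroll, planes), binarize(probe, planes)
print("Hamming distance:", int(np.sum(code_e != code_p)), "of 256 bits")
```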

Understanding the Privacy Implications of Adblock Plus’s Acceptable Ads

Ahsan Zafar (North Carolina State University, USA), Aafaq Sabir (North Carolina State University, USA), Dilawer Ahmed (North Carolina State University, USA), Anupam Das (North Carolina State University, USA)

Targeted advertisement is prevalent on the Web, and many privacy-enhancing tools have been developed to thwart it. Adblock Plus is one such popular tool, used by millions of users on a daily basis to block unwanted ads and trackers. Adblock Plus uses EasyList and EasyPrivacy, the most prominent and widely used open-source filter lists, to block unwanted web content. However, Adblock Plus by default also enables an exception list that unblocks web requests complying with specific guidelines defined by the Acceptable Ads Committee. Any publisher can enroll in the Acceptable Ads initiative to request the unblocking of web content. In return, Adblock Plus charges a licensing fee to large entities that gain a significant number of ad impressions per month through their participation in the initiative. The privacy implications of the default inclusion of this exception list have not been well studied, especially since it can unblock not only ads, but also trackers (e.g., content otherwise blocked by EasyPrivacy). In this paper, we take a data-driven approach: we collect historical updates made to Adblock Plus's exception list along with real-world web traffic gathered by visiting the top 10k websites listed by Tranco. Using this data, we analyze not only how the exception list has evolved over the years, in terms of both the content unblocked and the partners/entities enrolled in the Acceptable Ads initiative, but also the privacy implications of enabling the exception list by default. We find that Google not only unblocks the largest number of unique domains, but is also unblocked by the largest number of unique partners. From our traffic analysis, we see that of the 42,210 Google-bound web requests originally blocked by EasyPrivacy, around 80% are unblocked by the exception list. More worryingly, many of these requests load 1×1 tracking pixel images. We therefore question exception rules that negate EasyPrivacy filtering rules by default and advocate for a better vetting process.
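
To make the blocklist/exception-list interaction concrete, here is a deliberately simplified Python model of filter evaluation. Real Adblock Plus filter syntax supports anchors, wildcards, and options that this substring-only sketch ignores, and the rules shown are hypothetical.

```python
# Toy model of Adblock Plus filter evaluation: a request blocked by a
# blocklist rule (e.g. from EasyPrivacy) can still load if any "@@"
# exception rule (e.g. from the Acceptable Ads list) matches.
BLOCK_RULES = ["google-analytics.com/collect", "/tracking-pixel."]
EXCEPTION_RULES = ["@@google-analytics.com/collect"]  # hypothetical entry

def is_blocked(url):
    if not any(rule in url for rule in BLOCK_RULES):
        return False
    # A matching exception rule negates the block.
    return not any(rule.lstrip("@") in url for rule in EXCEPTION_RULES)

print(is_blocked("https://example.com/tracking-pixel.gif"))      # True
print(is_blocked("https://google-analytics.com/collect?uid=1"))  # False: unblocked
```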

Privacy-preserving Density-based Clustering

Beyza Bozdemir (Eurecom, France), Sébastien Canard (Orange Labs, France), Orhan Ermis (Eurecom, France), Helen Möllering (Technical University of Darmstadt, Germany), Melek Önen (Eurecom, France), Thomas Schneider (Technical University of Darmstadt, Germany)

Clustering is an unsupervised machine learning technique that outputs clusters containing similar data items. In this work, we investigate privacy-preserving density-based clustering, which is used, for example, in financial analytics and medical diagnosis. When (multiple) data owners collaborate or outsource the computation, privacy concerns arise. To address this problem, we design, implement, and evaluate the first practical and fully private density-based clustering scheme based on secure two-party computation. Our protocol privately executes the DBSCAN algorithm without disclosing any information (including the number and size of clusters). It can be used for private clustering between two parties as well as for private outsourcing by an arbitrary number of data owners to two non-colluding servers. Our implementation of the DBSCAN algorithm privately clusters data sets with 400 elements in 7 minutes on commodity hardware. It flexibly determines the required number of clusters and is insensitive to outliers, while being only a factor of 19x slower than today's fastest private K-means protocol (Mohassel et al., PETS'20), which can only be used for specific data sets. We then show how to transfer our newly designed protocol to related clustering algorithms by introducing a private approximation of the TRACLUS algorithm for trajectory clustering, which has interesting real-world applications such as financial time series forecasting and investigating the spread of diseases like COVID-19.
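
For readers unfamiliar with DBSCAN, the plaintext Python reference below shows the logic that the protocol evaluates obliviously. The paper's contribution lies in running this computation under secure two-party computation without revealing distances, neighborhoods, or intermediate cluster state; this sketch hides nothing.

```python
# Plaintext reference for what the protocol computes obliviously:
# textbook DBSCAN over a small toy data set.
import numpy as np

def dbscan(points, eps, min_pts):
    n = len(points)
    labels = [None] * n                       # None = unvisited, -1 = noise
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    nbrs = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(nbrs[i]) < min_pts:            # not a core point
            labels[i] = -1
            continue
        labels[i] = cluster                   # start a new cluster, expand it
        queue = list(nbrs[i])
        while queue:
            j = queue.pop()
            if labels[j] in (None, -1):
                if labels[j] is None and len(nbrs[j]) >= min_pts:
                    queue.extend(nbrs[j])     # j is core: expand through it
                labels[j] = cluster
        cluster += 1
    return labels

pts = np.array([[0, 0], [0.1, 0], [0.2, 0.1], [5, 5], [5.1, 5], [9, 9]])
print(dbscan(pts, eps=0.5, min_pts=2))        # [0, 0, 0, 1, 1, -1]
```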

DySan: Dynamically sanitizing motion sensor data against sensitive inferences through adversarial networks

Theo Jourdan (Insa-Lyon, CITI, Inria, France), Antoine Boutet (Insa-Lyon, CITI, Inria, France), Carole Frindel (Insa-Lyon, Creatis, Inserm, France), Claude Rosin Ngueveu (UQAM, Canada), Sebastien Gambs (UQAM, Canada)

With the widespread development of the quantified-self movement, an increasing number of users rely on mobile applications to monitor their physical activity through their smartphones. However, granting applications direct access to sensor data exposes users to privacy risks. In particular, motion sensor data are usually transmitted to analytics applications hosted in the cloud, which leverage machine learning models to give users feedback on their activity status. In this setting, nothing prevents the service provider from inferring private and sensitive information about a user, such as health or demographic attributes. To address this issue, we propose DySan, a privacy-preserving framework that sanitizes motion sensor data against unwanted sensitive inferences (i.e., improving privacy) while limiting the loss of accuracy in physical activity monitoring (i.e., maintaining data utility). Our approach is inspired by the framework of Generative Adversarial Networks: the sensor data is sanitized so as to ensure a good trade-off between utility and privacy. More precisely, by training several networks in a competitive manner, DySan builds models that sanitize motion data against inferences on a specified sensitive attribute (e.g., gender) while maintaining accurate activity recognition. DySan builds multiple sanitizing models, characterized by different sets of hyperparameters in the global loss function, and proposes a transfer learning scheme over time that dynamically selects the model providing the best utility and privacy trade-off for the incoming data. Experiments conducted on real datasets demonstrate that DySan can drastically limit gender inference, reducing it by up to 41 percentage points (from 98% with raw data to 57% with sanitized data), while reducing the accuracy of activity recognition by only 3 percentage points (from 95% with raw data to 92% with sanitized data).
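
A minimal sketch of the adversarial training idea follows, assuming a PyTorch setup with placeholder architectures and an invented trade-off weight lam: the sanitizer is trained to keep an activity classifier accurate while driving a gender discriminator toward chance. The paper's actual networks, losses, and hyperparameter sets differ.

```python
# Sketch of DySan-style adversarial training: sanitizer S reshapes a sensor
# window so that activity classifier A (assumed pretrained here) stays
# accurate while gender discriminator D is pushed toward chance.
import torch
import torch.nn as nn
import torch.nn.functional as F

WIN = 128 * 6   # e.g. 128 samples x 6 accelerometer/gyroscope axes, flattened
S = nn.Sequential(nn.Linear(WIN, 256), nn.ReLU(), nn.Linear(256, WIN))  # sanitizer
A = nn.Sequential(nn.Linear(WIN, 64), nn.ReLU(), nn.Linear(64, 4))      # 4 activities
D = nn.Sequential(nn.Linear(WIN, 64), nn.ReLU(), nn.Linear(64, 2))      # gender adversary
opt_s = torch.optim.Adam(S.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
lam = 0.5   # utility/privacy trade-off knob (placeholder value)

def train_step(x, activity, gender):
    # 1) Adversary: learn to recover gender from sanitized windows.
    opt_d.zero_grad()
    F.cross_entropy(D(S(x).detach()), gender).backward()
    opt_d.step()
    # 2) Sanitizer: keep activity recognizable, push D toward chance
    #    (cross-entropy against a uniform soft target).
    opt_s.zero_grad()
    sanitized = S(x)
    uniform = torch.full((x.size(0), 2), 0.5)
    loss = F.cross_entropy(A(sanitized), activity) \
           + lam * F.cross_entropy(D(sanitized), uniform)
    loss.backward()
    opt_s.step()
    return loss.item()

x = torch.randn(32, WIN)
print(train_step(x, torch.randint(0, 4, (32,)), torch.randint(0, 2, (32,))))
```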

Session Chair

Sherman S. M. Chow

Session 8A

Malware and Cybercrime (I)

Conference: 2:00 PM — 4:00 PM HKT
Local: Jun 10 Thu, 2:00 AM — 4:00 AM EDT

Malware Makeover: Breaking ML-based Static Analysis by Modifying Executable Bytes

Keane Lucas (Carnegie Mellon University, USA), Mahmood Sharif (VMware and TAU, Israel), Lujo Bauer (Carnegie Mellon University, USA), Michael K. Reiter (Duke University, USA), Saurabh Shintre (NortonLifeLock Research Group, USA)

Motivated by the transformative impact of deep neural networks (DNNs) in various domains, researchers and anti-virus vendors have proposed DNNs for malware detection from raw bytes that do not require manual feature engineering. In this work, we propose an attack that interweaves binary-diversification techniques and optimization frameworks to mislead such DNNs while preserving the functionality of binaries. Unlike prior attacks, ours manipulates instructions that are a functional part of the binary, which makes it particularly challenging to defend against. We evaluated our attack against three DNNs in white- and black-box settings, and found that it often achieved success rates near 100%. Moreover, we found that our attack can fool some commercial anti-viruses, in certain cases with a success rate of 85%. We explored several defenses, both new and old, and identified some that can foil over 80% of our evasion attempts. However, these defenses may still be susceptible to evasion by attacks, and so we advocate for augmenting malware-detection systems with methods that do not rely on machine learning.
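
The black-box variant of such an attack can be viewed as a greedy search over functionality-preserving rewrites. The Python sketch below captures that loop with stub scoring and mutation functions; a real attack would query the target DNN and apply semantics-preserving binary diversification rather than the illustrative byte perturbation used here.

```python
# Greedy evasion loop: keep only rewrites that lower the detector's
# maliciousness score. Both functions below are stubs for illustration.
import random

def score(binary):
    """Stub for the DNN's P(malicious); a real attack queries the model."""
    return sum(binary) / (255.0 * len(binary))

def diversify(binary):
    """Stub for a semantics-preserving rewrite (e.g. instruction
    substitution, register reassignment); here we just perturb one byte."""
    mutated = bytearray(binary)
    mutated[random.randrange(len(mutated))] = random.randrange(256)
    return bytes(mutated)

def evade(binary, threshold=0.5, budget=1000):
    best = binary
    for _ in range(budget):
        candidate = diversify(best)
        if score(candidate) < score(best):    # keep score-lowering rewrites
            best = candidate
        if score(best) < threshold:           # detector now says benign
            break
    return best

sample = bytes(random.randrange(128, 256) for _ in range(64))
print(f"score before={score(sample):.3f} after={score(evade(sample)):.3f}")
```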

Identifying Behavior Dispatchers for Malware Analysis

Kyuhong Park (Georgia Institute of Technology, USA), Burak Sahin (Georgia Institute of Technology, USA), Yongheng Chen (Georgia Institute of Technology, USA), Jisheng Zhao (Rice University, USA), Evan Downing (Georgia Institute of Technology, USA), Hong Hu (The Pennsylvania State University, USA), Wenke Lee (Georgia Institute of Technology, USA)

Malware is a major threat to modern computer systems. Malicious behaviors are hidden by a variety of techniques: code obfuscation, message encoding and encryption, etc. Countermeasures have been developed to thwart these techniques and expose malicious behaviors. However, these countermeasures rely heavily on identifying specific API calls, which has significant limitations, as these calls can be misleading or hidden from the analyst. In this paper, we show that malicious programs share a key component, which we call a behavior dispatcher: a code structure interposed between various condition checks and the malicious actions they gate. By identifying these behavior dispatchers, malware analysis can be guided to the relevant code and activate hidden malicious actions more easily. We propose BDHunter, a system that automatically identifies behavior dispatchers to assist in triggering malicious behaviors. BDHunter takes advantage of the observation that a dispatcher compares an input with a set of expected values to determine which malicious behaviors to execute next. We evaluate BDHunter on recent malware samples and show that the identified dispatchers can help trigger malicious behaviors that would otherwise remain hidden. Our experimental results show that BDHunter identifies 77.4% of dispatchers within the top 20 candidates discovered. Furthermore, BDHunter-guided concolic execution successfully triggers 13.0× and 2.6× more malicious behaviors than unguided symbolic and concolic execution, respectively. These results demonstrate that BDHunter effectively identifies behavior dispatchers, which are useful for exposing malicious behaviors.
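
The behavior dispatcher pattern is easy to picture with a toy example. The Python snippet below shows an invented command handler of the kind the abstract describes: a cluster of comparisons against expected values that fans out to hidden behaviors, which is the structural fingerprint an analyzer can locate and then drive (e.g., with concolic execution) to trigger each behavior.

```python
# Toy "behavior dispatcher": one code structure compares an input (e.g. a
# C&C command) against expected values and selects the matching behavior.
# Commands and handlers are invented for illustration.
def exfiltrate(): print("uploading files...")
def keylog():     print("starting keylogger...")
def go_dormant(): print("sleeping...")

HANDLERS = {"EXFIL": exfiltrate, "KEYLOG": keylog, "SLEEP": go_dormant}

def dispatch(command):
    # The dense set of equality checks against magic values is the
    # structural fingerprint a dispatcher-finding analysis searches for.
    handler = HANDLERS.get(command.upper())
    if handler:
        handler()

for cmd in ("exfil", "sleep"):
    dispatch(cmd)
```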

MalPhase: Fine-Grained Malware Detection Using Network Flow Data

Michal Piskozub (University of Oxford, United Kingdom), Fabio De Gaspari (Sapienza University of Rome, Italy), Freddie Barr-Smith (University of Oxford, United Kingdom), Luigi Mancini (Sapienza University of Rome, Italy), Ivan Martinovic (University of Oxford, United Kingdom)

Economic incentives encourage malware authors to constantly develop new, increasingly complex malware to steal sensitive data or blackmail individuals and companies into paying large ransoms. In 2017, the worldwide economic impact of cyberattacks was estimated at between 445 and 600 billion USD, or 0.8% of global GDP. Traditionally, one of the approaches used to defend against malware is network traffic analysis, which relies on network data to detect the presence of potentially malicious software. However, to keep up with increasing network speeds and traffic volumes, network analysis is generally limited to working on aggregated network data, which is challenging to analyze and traditionally yields mixed results. In this paper we present MalPhase, a system designed to cope with the limitations of aggregated flows. MalPhase features a multi-phase pipeline for malware detection as well as type and family classification. The use of an extended set of network flow features and a simultaneous multi-tier architecture improves the performance of deep learning models, enabling them to detect malicious flows (> 98% F1) and categorize them into the respective malware type (> 93% F1) and family (> 91% F1). Furthermore, the use of robust features and denoising autoencoders allows MalPhase to perform well on samples with varying amounts of benign traffic mixed in. Finally, MalPhase detects unseen malware samples with performance comparable to that on known samples, even when they are interlaced with benign flows to reflect realistic network environments.
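
A minimal sketch of what a multi-phase pipeline of this shape could look like, using scikit-learn with random placeholder features and labels: stage one separates malicious from benign flows, and only flagged flows proceed to type and family classification. The paper's actual flow features, deep models, and autoencoder-based denoising are not reproduced here.

```python
# Sketch of a multi-phase flow classifier: detection, then type, then
# family. Features, labels, and model choices are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((600, 16))             # aggregated flow feature vectors
is_mal = rng.integers(0, 2, 600)      # stage-1 labels: benign/malicious
mal_type = rng.integers(0, 4, 600)    # e.g. ransomware, botnet, ...
family = rng.integers(0, 8, 600)

detector = RandomForestClassifier().fit(X, is_mal)
typer = RandomForestClassifier().fit(X[is_mal == 1], mal_type[is_mal == 1])
familizer = RandomForestClassifier().fit(X[is_mal == 1], family[is_mal == 1])

def classify(flow):
    flow = flow.reshape(1, -1)
    if detector.predict(flow)[0] == 0:          # stage 1: gatekeeper
        return "benign"
    return f"type={typer.predict(flow)[0]} family={familizer.predict(flow)[0]}"

print(classify(X[0]))
```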

Session Chair

Sang Kil Cha

Session 9A

Hardware Security (II)

Conference: 4:20 PM — 5:20 PM HKT
Local: Jun 10 Thu, 4:20 AM — 5:20 AM EDT

(Mis)managed: A Novel TLB-based Covert Channel on GPUs

Ajay Nayak (Indian Institute of Science, India), Pratheek B (Indian Institute of Science, India), Vinod Ganapathy (Indian Institute of Science, India), Arkaprava Basu (Indian Institute of Science, India)

GPUs are now commonly available on most modern computing platforms and are increasingly being adopted in cloud platforms and data centers due to their immense computing capability. In response to this growth in usage, manufacturers continuously improve GPU hardware by adding new features. However, this increase in usage and the addition of utility-improving features can create new, unexpected attack channels. In this paper, we show that two such features, unified virtual memory (UVM) and multi-process service (MPS), introduced primarily to improve the programmability and efficiency of GPU kernels, have an unexpected consequence: they create a novel covert timing channel via the GPU's translation lookaside buffer (TLB) hierarchy. To enable this covert channel, we first perform experiments to understand the characteristics of the TLBs present on a GPU. The use of UVM allows fine-grained management of translations and helps us discover several idiosyncrasies of the TLB hierarchy, such as three levels of TLB and coalesced entries. We use this newly acquired understanding to demonstrate a novel covert channel via the shared TLB. We then leverage MPS to increase the bandwidth of this channel by 40×. Finally, we demonstrate the channel's utility by leaking data from a GPU-accelerated database application.
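
The receiver side of a contention-based timing channel like this one reduces to thresholding probe latencies. The Python sketch below simulates that decoding step with invented latency constants; a real receiver would time UVM-backed memory accesses on the GPU, where the sender's eviction (or not) of shared TLB entries in each bit interval makes the probe slow or fast.

```python
# Receiver-side decoding common to contention-based timing channels:
# classify each probe latency against a threshold to recover one bit.
# Latencies are simulated; HIT_NS/MISS_NS/JITTER are made-up values.
import random

HIT_NS, MISS_NS, JITTER = 40, 300, 25

def simulate_probe(sender_bit):
    base = MISS_NS if sender_bit else HIT_NS   # eviction -> TLB miss -> slow
    return base + random.gauss(0, JITTER)

def decode(latencies, threshold=(HIT_NS + MISS_NS) / 2):
    return [1 if t > threshold else 0 for t in latencies]

message = [1, 0, 1, 1, 0, 0, 1, 0]
observed = [simulate_probe(b) for b in message]
print("sent:", message, "recovered:", decode(observed))
```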

Scanning the Cycle: Timing-based Authentication on PLCs

Chuadhry Mujeeb Ahmed (University of Strathclyde, Scotland), Martin Ochoa (AppGate, USA), Jianying Zhou (Singapore University of Technology and Design, Singapore), Aditya Mathur (Singapore University of Technology and Design, Singapore)

Programmable Logic Controllers (PLCs) are a core component of an Industrial Control System (ICS). If a PLC is compromised, or the commands it sends across the network are spoofed, the consequences could be catastrophic. In this work, a novel technique to authenticate PLCs is proposed that aims to raise the bar against powerful attackers while remaining compatible with real-time systems. The proposed technique captures timing information for each controller in a non-invasive manner. It is argued that the scan cycle is a unique feature of a PLC that can be approximated passively by observing network traffic, and that an attacker spoofing commands issued by a PLC would deviate from this fingerprint. To detect replay attacks, a PLC Watermarking technique is proposed, which models the relation between the scan cycle and the control logic by modeling the input/output as a function of a PLC's request/response messages. The proposed technique is validated on operational water treatment (SWaT) and smart grid (EPIC) testbeds. Results from our experiments indicate that PLCs can be distinguished based on their scan cycle timing characteristics.
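
As a rough illustration of passive scan-cycle fingerprinting (not the paper's exact model), the Python sketch below estimates a cycle time from the inter-arrival gaps of a PLC's periodic messages and accepts traffic whose timing matches the enrolled profile. All timings and the 3-sigma acceptance rule are illustrative.

```python
# Passive scan-cycle fingerprinting sketch: profile a PLC from message
# inter-arrival times, then flag traffic whose timing deviates.
import numpy as np

def scan_cycle_profile(timestamps):
    gaps = np.diff(np.sort(timestamps))
    return gaps.mean(), gaps.std()

def authenticate(timestamps, profile, k=3.0):
    """Accept if the observed mean cycle lies within k sigma of enrollment."""
    mu, sigma = profile
    observed_mu, _ = scan_cycle_profile(timestamps)
    return abs(observed_mu - mu) <= k * max(sigma, 1e-9)

rng = np.random.default_rng(1)
enroll = np.cumsum(rng.normal(0.010, 0.0002, 500))    # ~10 ms scan cycle
profile = scan_cycle_profile(enroll)
genuine = np.cumsum(rng.normal(0.010, 0.0002, 100))
spoofed = np.cumsum(rng.normal(0.012, 0.0002, 100))   # attacker's timing differs
print(authenticate(genuine, profile), authenticate(spoofed, profile))
```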

Transduction Shield: A Low-Complexity Method to Detect and Correct the Effects of EMI Injection Attacks on Sensors

Yazhou Tu (University of Louisiana at Lafayette, USA), Vijay Srinivas Tida (University of Louisiana at Lafayette, USA), Zhongqi Pan (University of Louisiana at Lafayette, USA), Xiali Hei (University of Louisiana at Lafayette, USA)

The reliability of control systems often depends on the trustworthiness of sensors. As process automation and robotics keep evolving, sensing methods such as pressure sensing are extensively used in both conventional systems and rapidly emerging applications. The goal of this paper is to investigate these threats and design a low-complexity method to defend against EMI injection attacks on sensors. To ensure the security and usability of sensors and automated processes, we propose to leverage a matched dummy sensor circuit that shares the sensor's vulnerabilities to EMI but is insensitive to the legitimate signals the sensor is intended to measure. Our method can detect and correct corrupted sensor measurements without introducing components or modules that are highly complex compared to the original low-end sensor circuit. We analyze and evaluate our method through EMI injection experiments on sensors using different attack parameters. We investigate several attack scenarios, including manipulating the DC voltage of the sensor output and injecting sinusoidal signals, white noise, and malicious voice signals. Our experimental results suggest that, with relatively low cost and computational overhead, the proposed method not only detects attacks but also corrects corrupted sensor data, helping to maintain the functioning of systems based on different kinds of sensors in the presence of attacks.
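
The following is an idealized numerical model of the dummy-circuit idea, assuming the dummy's EMI coupling perfectly matches the real sensor's (which real hardware only approximates): the dummy's output flags the attack, and subtracting it from the sensor output recovers the legitimate signal. All signal parameters are invented.

```python
# Idealized model of the dummy-circuit defense: the dummy shares the real
# sensor's EMI coupling but is blind to the physical quantity, so its
# output both detects the attack and, by subtraction, corrects the reading.
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 1000)
true_pressure = 2.0 + 0.1 * np.sin(2 * np.pi * 3 * t)            # legitimate signal
emi = np.where(t > 0.5, 0.8 * np.sin(2 * np.pi * 50 * t), 0.0)   # attack after t=0.5

sensor = true_pressure + emi + rng.normal(0, 0.01, t.size)
dummy = emi + rng.normal(0, 0.01, t.size)                        # sees only EMI

attack_detected = np.abs(dummy).max() > 0.1   # dummy should be near-silent
corrected = sensor - dummy
print("attack detected:", bool(attack_detected))
print("max residual error:", float(np.abs(corrected - true_pressure).max()))
```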

Session Chair

Guoxing Chen
